Extending the Relational Algebra with the Mapper Operator

Authors

  • Paulo Carreira
  • Antónia Lopes
  • Helena Galhardas
  • João Pereira
Abstract

Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches for implementing these data transformations enclose solutions as Persistent Stored Modules (PSM) executed by an RDBMS or transformation code using a commercial ETL tool. Neither of these is easily maintainable or optimizable. A third approach consists of combining SQL queries with external code, written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations that produce several output tuples for a single input tuple. In this paper, we propose the data mapper operator as an extension to the relational algebra to address this class of data transformations. Furthermore, we supply a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Finally, experimental results report the benefits brought by some of the proposed semantic optimizations.
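To make the idea of a one-to-many transformation concrete, the following is a minimal Python sketch, not the paper's formal definition. The names `mapper`, `instalments`, and the loan data are hypothetical and chosen only for illustration. The sketch also shows the flavor of one rewriting that seems plausible under these assumptions: a selection on an attribute that the mapper copies through unchanged can be applied before the mapper instead of after it.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]  # one tuple, keyed by attribute name (illustrative only)


def mapper(rows: Iterable[Row], fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply a one-to-many function to each input tuple and concatenate the outputs."""
    for row in rows:
        yield from fn(row)


def instalments(row: Row) -> List[Row]:
    """Hypothetical transformation: one output tuple per instalment of a loan."""
    share = row["amount"] // row["parts"]
    return [
        {"loan": row["loan"], "seq": i, "due": share}
        for i in range(1, row["parts"] + 1)
    ]


# Made-up input relation: each loan expands into `parts` payment tuples.
loans = [
    {"loan": 1, "amount": 300, "parts": 3},
    {"loan": 2, "amount": 100, "parts": 2},
]

payments = list(mapper(loans, instalments))

# Selection applied after the mapper ...
filtered_after = [p for p in payments if p["loan"] == 1]

# ... and the same selection pushed below the mapper. This is valid here only
# because `loan` is copied from input to output without modification.
filtered_before = list(mapper((r for r in loans if r["loan"] == 1), instalments))
```

In this sketch the pushed-down variant expands one loan instead of two, which hints at why such rewritings can reduce work when the mapper function is expensive.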


Similar resources

Extending Relational Algebra to express one-to-many data transformations

Application scenarios such as legacy-data migration, ETL processes, data cleaning and data-integration require the transformation of input tuples into output tuples. Traditional approaches for implementing these data transformations enclose solutions as Persistent Stored Modules (PSM) executed by an RDBMS or transformation code using a commercial ETL tool. Neither of these solutions is easily m...


One-to-many data transformations through data mappers

The optimization capabilities of RDBMSs are making them attractive for executing data transformations. However, despite the fact that many useful data transformations can be expressed as relational queries, an important class of data transformations that produce several output tuples for a single input tuple cannot be expressed in that way. To overcome this limitation, we propose to extend Rel...


Data Mapper: An Operator for Expressing One-to-Many Data Transformations

Transforming data is a fundamental operation in application scenarios involving data integration, legacy data migration, data cleaning, and extract-transform-load processes. Data transformations are often implemented as relational queries that aim at leveraging the optimization capabilities of most RDBMSs. However, relational query languages like SQL are not expressive enough to specify an impo...


Repetitions and permutations of columns in the semijoin algebra

Codd defined the relational algebra [3, 4] as the algebra with operations projection, join, restriction, union and difference. His projection operator can drop, permute and repeat columns of a relation. This permuting and repeating of columns does not really add expressive power to the relational algebra. Indeed, using the join operation, one can rewrite any relational algebra expression into a...


Relational Approach to XPath Query Optimization

This thesis contributes to the Pathfinder project which aims at creating an XQuery compiler on top of a relational database system. Currently, it is being implemented on top of MonetDB, a main memory database system. For optimization and portability purposes, Pathfinder first compiles an XQuery expression into its own relational algebra, before translating the query into the query language of t...




Publication date: 2005